Development of soft computing models for data mining
Abstract
The increasing amount and complexity of the data available today in science, business, industry and many other areas creates an urgent need to accelerate the discovery of knowledge in large databases. Such data can provide a rich resource for knowledge discovery and decision support. To understand, analyze and eventually use this data, a multidisciplinary approach called data mining has been proposed. Technically, data mining is the process of finding correlations or patterns among dozens of fields in large relational databases. Pattern classification is one particular category of data mining, which enables the discovery of knowledge from very large databases (VLDB). In this paper, the database is mined through pattern classification using two important mining tools: the K-Nearest Neighbour algorithm and decision trees. K-Nearest Neighbour (K-NN) is a widely used conventional statistical approach to data mining. K-NN classifies each record in a data set based on a combination of the classes of the K records most similar to it in a historical data set. A fuzzy version of K-NN, and crisp and fuzzy versions of nearest prototype classifiers, have also been proposed. The decision tree is one of the best machine learning approaches for data mining. A decision tree is a predictive model that, as its name implies, can be viewed as a tree. Briefly, decision trees are tree-shaped structures that represent sets of decisions. These decisions generate rules for the classification of a data set. Classification and Regression Tree (CART) and ID3 are the two decision tree methods used in this paper. The classification rules have been extracted in the form of IF-THEN rules. The performance of the K-NN methods and the tree-based classifiers has been analyzed. The proposed methods have been tested on three data sets: Landsat imagery, letter image recognition, and optical recognition of handwritten digits. The simulation algorithms have been implemented in C++ on a UNIX platform.
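The crisp K-NN rule described in the abstract can be illustrated with a minimal C++ sketch. It is not the authors' implementation: the `Record` layout, the function names `squaredDistance` and `knnClassify`, the use of squared Euclidean distance as the similarity measure, and the tie-breaking rule are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// A labelled record: a feature vector plus a class label (hypothetical layout).
struct Record {
    std::vector<double> features;
    int label;
};

// Squared Euclidean distance. The abstract does not name a similarity
// measure, so this choice is an assumption.
double squaredDistance(const std::vector<double>& a,
                       const std::vector<double>& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Classify `query` by majority vote among the k records in `history`
// that are closest to it. Vote ties go to the smallest label (assumption).
int knnClassify(const std::vector<Record>& history,
                const std::vector<double>& query,
                std::size_t k) {
    assert(!history.empty() && k > 0);
    k = std::min(k, history.size());

    // Pair every historical record's distance with its label.
    std::vector<std::pair<double, int>> scored;
    scored.reserve(history.size());
    for (const Record& r : history) {
        scored.emplace_back(squaredDistance(r.features, query), r.label);
    }

    // Move the k nearest records to the front, ordered by distance.
    std::partial_sort(scored.begin(),
                      scored.begin() + static_cast<std::ptrdiff_t>(k),
                      scored.end());

    // Count the class labels among the k nearest records.
    std::map<int, std::size_t> votes;
    for (std::size_t i = 0; i < k; ++i) {
        ++votes[scored[i].second];
    }

    // Return the label with the most votes.
    int best = scored[0].second;
    std::size_t bestVotes = 0;
    for (const auto& kv : votes) {
        if (kv.second > bestVotes) {
            best = kv.first;
            bestVotes = kv.second;
        }
    }
    return best;
}
```

Calling `knnClassify(history, query, 3)` returns the class held by most of the three historical records nearest to `query`. A fuzzy variant would replace the vote count with distance-weighted class memberships.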
Similar resources
Application of non-linear regression and soft computing techniques for modeling process of pollutant adsorption from industrial wastewaters
The process of pollutant adsorption from industrial wastewaters is a multivariate problem. This process is affected by many factors including the contact time (T), pH, adsorbent weight (m), and solution concentration (ppm). The main target of this work is to model and evaluate the process of pollutant adsorption from industrial wastewaters using the non-linear multivariate regression and intell...
Application of Soft Computing Methods for the Estimation of Roadheader Performance from Schmidt Hammer Rebound Values
Estimation of roadheader performance is one of the main topics in determining the economics of underground excavation projects. Poor performance estimation of roadheaders can lead to costly contractual claims. In this paper, the application of soft computing methods for data analysis called adaptive neuro-fuzzy inference system- subtractive clustering method (ANFIS-SCM) and artificial neu...
Using the Reaction Delay as the Driver Effects in the Development of Car-Following Models
Car-following models, the most popular approach to microscopic traffic flow modeling, are increasingly being used by transportation experts to evaluate new Intelligent Transportation System (ITS) applications. A number of factors, including individual differences in age, gender, and risk-taking behavior, have been found to influence car-following behavior. This paper presents a novel idea to calculate ...
Utilization of Soft Computing for Evaluating the Performance of Stone Sawing Machines, Iranian Quarries
The escalating construction industry has led to a drastic increase in the dimension stone demand in the construction, mining and industry sectors. Assessment and investigation of mining projects and stone processing plants such as sawing machines is necessary to manage and respond to the sawing performance; hence, the soft computing techniques were considered as a challenging task due to stocha...
Performance evaluation of chain saw machines for dimensional stones using feasibility of neural network models
Prediction of the production rate of the cutting dimensional stone process is crucial, especially when chain saw machines are used. The cutting dimensional rock process is generally a complex issue with numerous effective factors including variable and unreliable conditions of the rocks and cutting machines. The Group Method of Data Handling (GMDH) type of neural network and Radial Basis Functi...
Efficient Data Mining with Evolutionary Algorithms for Cloud Computing Application
With the rapid development of the internet, the amount of information and data being produced is massive. Clients are therefore confused by the huge amount of data, and it is difficult to tell which data are useful. Data mining can overcome this problem. When data mining is run on cloud computing, it reduces processing time, energy usage and costs. As the speed of ...
Publication date: 2012